{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Copyright (C) 2023 Arm Limited or its affiliates. All rights reserved.\n",
    "#\n",
    "# SPDX-License-Identifier: Apache-2.0\n",
    "#\n",
    "# Licensed under the Apache License, Version 2.0 (the License); you may\n",
    "# not use this file except in compliance with the License.\n",
    "# You may obtain a copy of the License at\n",
    "#\n",
    "# www.apache.org/licenses/LICENSE-2.0\n",
    "#\n",
    "# Unless required by applicable law or agreed to in writing, software\n",
    "# distributed under the License is distributed on an AS IS BASIS, WITHOUT\n",
    "# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "# See the License for the specific language governing permissions and\n",
    "# limitations under the License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DNN_Medium - Optimised\n",
    "\n",
    "Here we reproduce the models with our established codebase and ModelPackage approach for your convenience.\n",
    "\n",
    "## Model-Package Overview:\n",
    "\n",
    "| Model           \t| DNN_Medium                            \t|\n",
    "|:---------------:\t|:---------------------------------------------------------------:\t|\n",
    "| <u>**Format**</u>:          \t| Keras, Saved Model, TensorFlow Lite int8, TensorFlow Lite fp32 |\n",
    "| <u>**Feature**</u>:         \t| Keyword spotting for Arm Cortex-M CPUs |\n",
    "| <u>**Architectural Delta w.r.t. Vanilla**</u>: | None |\n",
    "| <u>**Domain**</u>:         \t| Keyword spotting |\n",
    "| <u>**Package Quality**</u>: \t| Optimised |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Table of contents <a name=\"index_page\"></a>\n",
    "\n",
    "This how-to guidance presents the key steps to reproduce everything in this package. The contents are organised as below. We provided the internal navigation links for users to easy-jump among different sections.  \n",
    "\n",
    "    \n",
    "* [1.0 Model recreation](#model_recreation)\n",
    "\n",
    "* [2.0 Training](#training)\n",
    "\n",
    "* [3.0 Testing](#testing)\n",
    "\n",
    "* [4.0 Optimization](#optimization)\n",
    "\n",
    "* [5.0 Quantization and TFLite conversion](#tflite_conversion)\n",
    "\n",
    "* [6.0 Inference the TFLite model files](#tflite_inference)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.0 Model Recreation<a name=\"model_recreation\"></a>\n",
    "\n",
    "In order to recreate the model you will first need to be using ```Python3.7``` and install the requirements in ```requirements.txt```.\n",
    "\n",
    "Once you have these requirements satisfied you can execute the recreation script contained within this folder, just run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-01-31 13:21:58.189962: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "Untarring speech_commands_v0.02.tar.gz...\n",
      "2023-01-31 13:22:48.489206: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1\n",
      "2023-01-31 13:22:48.528844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:22:48.528880: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "2023-01-31 13:22:48.548795: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11\n",
      "2023-01-31 13:22:48.548866: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11\n",
      "2023-01-31 13:22:48.551645: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10\n",
      "2023-01-31 13:22:48.551935: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10\n",
      "2023-01-31 13:22:48.552501: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11\n",
      "2023-01-31 13:22:48.553238: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11\n",
      "2023-01-31 13:22:48.553392: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8\n",
      "2023-01-31 13:22:48.553886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:22:48.554176: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA\n",
      "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
      "2023-01-31 13:22:48.554998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:22:48.555410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:22:48.555527: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "2023-01-31 13:22:48.994481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:22:48.994520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:22:48.994528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:22:48.995028: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10939 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.\n",
      "2023-01-31 13:22:50.146418: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.\n",
      "2023-01-31 13:22:50.411740: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1\n",
      "2023-01-31 13:22:50.411969: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session\n",
      "2023-01-31 13:22:50.412348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:22:50.412596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:22:50.412627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:22:50.412636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:22:50.412643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:22:50.412919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10939 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "2023-01-31 13:22:50.431567: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3492140000 Hz\n",
      "2023-01-31 13:22:50.433318: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1144] Optimization results for grappler item: graph_to_optimize\n",
      "  function_optimizer: function_optimizer did nothing. time = 0.017ms.\n",
      "  function_optimizer: function_optimizer did nothing. time = 0.003ms.\n",
      "\n",
      "2023-01-31 13:22:50.470457: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:345] Ignored output_format.\n",
      "2023-01-31 13:22:50.470496: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:348] Ignored drop_control_dependency.\n",
      "2023-01-31 13:22:50.473049: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.\n",
      "2023-01-31 13:22:50.475051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:22:50.475342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:22:50.475376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:22:50.475387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:22:50.475395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:22:50.475693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10939 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "Converted model saved to dnn.tflite.\n",
      "Running TFLite evaluation on validation set...\n",
      "2023-01-31 13:22:50.520336: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)\n",
      "[[371   0   0   0   0   0   0   0   0   0   0   0]\n",
      " [  0 265   9   7   5  18  11  12  17   5   8  14]\n",
      " [  0   6 346   9   0   2  22   5   1   0   1   5]\n",
      " [  0   9   8 323   8  14   3   5   0   1   2  33]\n",
      " [  0   4   0   2 304   1   3   3   4  17   9   3]\n",
      " [  0   8   1  19   1 326   2   1   7   0   0  12]\n",
      " [  0   2  24   2   3   1 304  13   0   0   0   3]\n",
      " [  0  10   1   1   4   1   4 336   1   2   0   3]\n",
      " [  1  10   1   1   7   2   0   2 326   9   1   3]\n",
      " [  1   2   0   1  27   0   1   1  11 321   4   4]\n",
      " [  2   5   0   0  16   2   2   1   1   2 318   1]\n",
      " [  0  13   0  43   6  13   1   2   3   3   1 287]]\n",
      "Validation accuracy = 86.10%(N=4445)\n",
      "Running TFLite evaluation on test set...\n",
      "[[408   0   0   0   0   0   0   0   0   0   0   0]\n",
      " [  0 295   7  11   6   6  13  12  24   8   5  21]\n",
      " [  0  12 380   3   0   4  15   1   0   0   0   4]\n",
      " [  1  11   2 332   0  22   1   0   0   0   0  36]\n",
      " [  0  14   1   2 357   2   2   5  12  11  11   8]\n",
      " [  0  18   5  18   6 329   5   1   4   0   2  18]\n",
      " [  0  10  25   3   4   1 347  15   1   0   2   4]\n",
      " [  0  20   1   0   5   1  14 349   1   5   0   0]\n",
      " [  0  12   0   1   5   9   0   0 347  16   2   4]\n",
      " [  0  15   0   1  15   1   5   2  12 339   3   9]\n",
      " [  0   5   0   3  21   2   4   1   2   1 368   4]\n",
      " [  0  10   1  62   8  13   3   1   0   0   1 303]]\n",
      "Test accuracy = 84.95%(N=4890)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2023-01-31 13:23:02.712653: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "Untarring speech_commands_v0.02.tar.gz...\n",
      "2023-01-31 13:23:53.488800: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1\n",
      "2023-01-31 13:23:53.524175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:23:53.524209: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "2023-01-31 13:23:53.544183: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11\n",
      "2023-01-31 13:23:53.544253: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11\n",
      "2023-01-31 13:23:53.546889: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10\n",
      "2023-01-31 13:23:53.547146: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10\n",
      "2023-01-31 13:23:53.547744: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11\n",
      "2023-01-31 13:23:53.548454: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11\n",
      "2023-01-31 13:23:53.548596: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8\n",
      "2023-01-31 13:23:53.548947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:23:53.549238: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA\n",
      "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
      "2023-01-31 13:23:53.549958: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:23:53.550439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:23:53.550510: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n",
      "2023-01-31 13:23:53.960933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:23:53.960972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:23:53.960979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:23:53.961483: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10940 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.\n",
      "2023-01-31 13:23:55.053376: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.\n",
      "2023-01-31 13:23:55.321894: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1\n",
      "2023-01-31 13:23:55.322084: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session\n",
      "2023-01-31 13:23:55.322539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:23:55.322808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:23:55.322839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:23:55.322850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:23:55.322858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:23:55.323143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10940 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "2023-01-31 13:23:55.347442: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3492140000 Hz\n",
      "2023-01-31 13:23:55.348486: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1144] Optimization results for grappler item: graph_to_optimize\n",
      "  function_optimizer: function_optimizer did nothing. time = 0.011ms.\n",
      "  function_optimizer: function_optimizer did nothing. time = 0.002ms.\n",
      "\n",
      "2023-01-31 13:23:55.387556: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:345] Ignored output_format.\n",
      "2023-01-31 13:23:55.387602: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:348] Ignored drop_control_dependency.\n",
      "2023-01-31 13:23:55.390277: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.\n",
      "2023-01-31 13:23:55.392318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: \n",
      "pciBusID: 0000:03:00.0 name: NVIDIA TITAN Xp computeCapability: 6.1\n",
      "coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s\n",
      "2023-01-31 13:23:55.392627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0\n",
      "2023-01-31 13:23:55.392665: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:\n",
      "2023-01-31 13:23:55.392681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 \n",
      "2023-01-31 13:23:55.392693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N \n",
      "2023-01-31 13:23:55.393015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10940 MB memory) -> physical GPU (device: 0, name: NVIDIA TITAN Xp, pci bus id: 0000:03:00.0, compute capability: 6.1)\n",
      "2023-01-31 13:23:55.414179: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)\n",
      "fully_quantize: 0, inference_type: 6, input_inference_type: 9, output_inference_type: 9\n",
      "Quantized model saved to dnn_quantized.tflite.\n",
      "Running TFLite evaluation on validation set...\n",
      "[[371   0   0   0   0   0   0   0   0   0   0   0]\n",
      " [  0 272   6   8   8  19   9  12  17   6   4  10]\n",
      " [  0  11 341   9   5   2  20   6   0   0   0   3]\n",
      " [  0  15   9 319  13  13   2   4   1   1   3  26]\n",
      " [  0   6   0   3 307   1   1   2   3  16   9   2]\n",
      " [  0  11   1  20  12 312   3   0   6   0   1  11]\n",
      " [  0   7  26   3   5   1 294  11   1   1   1   2]\n",
      " [  0  13   1   1   9   2   5 326   1   1   2   2]\n",
      " [  2  13   0   0   7   4   1   2 318  10   4   2]\n",
      " [  1   4   0   2  37   0   1   2  12 308   3   3]\n",
      " [  2   5   0   0  21   2   2   1   1   3 312   1]\n",
      " [  0  16   1  43   9  15   1   3   1   3   1 279]]\n",
      "Validation accuracy = 84.57%(N=4445)\n",
      "Running TFLite evaluation on test set...\n",
      "[[408   0   0   0   0   0   0   0   0   0   0   0]\n",
      " [  0 303   7  13   6   4  12   9  22   8   6  18]\n",
      " [  0  13 370   5   4   3  15   1   1   0   2   5]\n",
      " [  0  12   6 335   4  19   1   1   1   0   0  26]\n",
      " [  0  14   1   4 354   1   0   3  15  14  11   8]\n",
      " [  0  26   5  26  10 316   5   2   3   0   1  12]\n",
      " [  0  15  25   2   9   1 334  17   1   0   2   6]\n",
      " [  0  19   1   0  10   1  14 338   4   4   4   1]\n",
      " [  0  16   1   2   8   8   1   0 339  11   6   4]\n",
      " [  0  15   0   1  27   0   6   2  12 329   3   7]\n",
      " [  0   9   0   3  22   2   4   1   2   2 360   6]\n",
      " [  0  20   0  63  16  12   1   3   1   1   6 279]]\n",
      "Test accuracy = 83.13%(N=4890)\n"
     ]
    }
   ],
   "source": [
    "!bash ./recreate_model.sh"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running this script will use the pre-trained checkpoint files supplied in the ```./model_archive/model_source/weights``` folder to generate the TFLite files and perform evaluation on the test set. Both an fp32 version and a quantized version will be produced. The quantized version will use post-training quantization to fully quantize it.\n",
    "\n",
    "If you want to run training from scratch you can do this by supplying ```--train``` when running the script. For example:\n",
    "\n",
    "```bash\n",
    "bash ./recreate_model.sh --train\n",
    "```\n",
    "\n",
    "Training is then performed and should produce a model to the stated accuracy in this repository. Note that exporting to TFLite will still happen with the baseline pre-trained checkpoint files, so you will need to re-run the script and this time supply the path to the new checkpoint files you want to use, for example:\n",
    "\n",
    "```bash\n",
    "bash ./recreate_model.sh --ckpt <checkpoint_path>\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.0 Training<a name=\"training\"></a>\n",
    "\n",
    "The training scripts can be used to recreate any of the models from the [Hello Edge paper](https://arxiv.org/pdf/1711.07128.pdf) provided the right hyperparameters are used. The training commands with all the hyperparameters to reproduce the model in this repository are given [here](recreate_model.sh). The model in this part of the repository represents just one variation of the models from the paper, other varieties are covered in other parts of the repository.\n",
    "\n",
    "\n",
    "As a general example of how to train a DNN with 3 fully-connected layers with 128 neurons in each layer, run:\n",
    "```\n",
    "python train.py --model_architecture dnn --model_size_info 128 128 128\n",
    "```\n",
    "\n",
    "The command line argument *--model_size_info* is used to pass the neural network layer\n",
    "dimensions such as number of layers, convolution filter size/stride as a list to models.py,\n",
    "which builds the TensorFlow graph based on the provided model architecture\n",
    "and layer dimensions. For more info on *model_size_info* for each network architecture see\n",
    "[models.py](model_core_utils/models.py).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.0 Testing<a name=\"testing\"></a>\n",
    "To run inference on the trained model from a checkpoint and get accuracy on validation and test sets, run:\n",
    "```\n",
    "python evaluation.py --model_architecture dnn --model_size_info 128 128 128 --checkpoint <checkpoint_path>\n",
    "```\n",
    "**The model and feature extraction parameters passed to this script should match those used in the Training step.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.0 Optimization<a name=\"optimization\"></a>\n",
    "\n",
    "We introduce an *optional* step to optimize the trained keyword spotting model for deployment.\n",
    "\n",
    "Here we use TensorFlow's [weight clustering API](https://www.tensorflow.org/model_optimization/guide/clustering) to reduce the compressed model size and optimize inference on supported hardware. 32 weight clusters and kmeans++ cluster intialization method are used as the clustering hyperparameters.\n",
    "\n",
    "To optimize your trained model (e.g. a DNN), a trained model checkpoint is needed to run clustering and fine-tuning on.\n",
    "You can use the pre-trained checkpoints provided, or train your own model and use the resulting checkpoint.\n",
    "\n",
    "To apply the optimization and fine-tuning, run the following command:\n",
    "```\n",
    "python optimisations.py --model_architecture dnn --model_size_info 128 128 128 --checkpoint <checkpoint_path>\n",
    "```\n",
    "**The model and feature extraction parameters used here should match those used in the Training step, except for the number of training steps.\n",
    "The number of training steps is reduced since the optimization step only requires fine-tuning.**\n",
    "\n",
    "This will generate a clustered model checkpoint that can be used in the quantization step to generate a quantized and clustered TFLite model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.0 Quantization and TFLite Conversion<a name=\"tflite_conversion\"></a>\n",
    "\n",
    "You can now use TensorFlow's\n",
    "[post training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization) to\n",
    "make quantization of the trained models super simple.\n",
    "\n",
    "To quantize your trained model (e.g. a DNN) run:\n",
    "```\n",
    "python convert_to_tflite.py --model_architecture dnn --model_size_info 128 128 128 --checkpoint <checkpoint_path> [--inference_type int8|int16]\n",
    "```\n",
    "**The model and feature extraction parameters used here should match those used in the Training step.**\n",
    "\n",
    "The ```inference_type``` parameter is *optional* and to be used if a fully quantized model with inputs and outputs of type int8 or int16 is needed. It defaults to fp32.\n",
    "\n",
    "In this example, this step will produce a quantized TFLite file *dnn_quantized.tflite*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can test the accuracy of this quantized model on the test set by running:\n",
    "```\n",
    "python evaluation.py --tflite_path dnn_quantized.tflite\n",
    "```\n",
    "**The model and feature extraction parameters used here should match those used in the Training step.**\n",
    "\n",
    "`convert_to_tflite.py` uses post-training quantization to generate a quantized model by default. If you wish to convert to a floating point TFLite model, use the command below:\n",
    "\n",
    "```\n",
    "python convert_to_tflite.py --model_architecture dnn --model_size_info 128 128 128 --checkpoint <checkpoint_path> --no-quantize\n",
    "```\n",
    "\n",
    "This will produce a floating point TFLite file *dnn.tflite*. You can test the accuracy of this floating point model using `evaluation.py` as above.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6.0 Single inference of the TFLite model files <a name=\"tflite_inference\"></a>\n",
    "\n",
    "You can conduct TFLite inference for .fp32 and .int8 model files by using the following command: \n",
    "\n",
    "```python dnn_m_inference_tflite.py --labels validation_utils/labels.txt --wav <path_to_wav_file> --tflite_path <path_to_tflite_file>```\n",
    "\n",
    "**The feature extraction parameters used here should match those used in the Training step.**\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
